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Molecular crowding is a critical feature distinguishing intracellular environments from idealized 
solution-based environments and is essential to understanding numerous biochemical reactions, from 
protein folding to signal transduction. Many biochemical reactions are dramatically altered by crowding, yet 
it is extremely difficult to predict how crowding will quantitatively affect any particular reaction systems. We 
previously developed a novel stochastic off-lattice model to efficiently simulate binding reactions across 
wide parameter ranges in various crowded conditions. We now show that a polynomial regression model can 
incorporate several interrelated parameters influencing chemistry under crowded conditions. The unified 
model of binding equilibria accurately reproduces the results of particle simulations over a broad range of 
variation of six physical parameters that collectively yield a complicated, non-linear crowding effect. The 
work represents an important step toward the long-term goal of computationally tractable predictive 
models of reaction chemistry in the cellular environment. 

The intracellular environment is a densely concentrated region packed with a mixture of numerous types of 
macromolecules and subcellular structures 1 . Because of this dense crowding, biochemical reactions in vivo 
behave significantly differently from the same reactions in the well-mixed and dilute conditions of typical 
in vitro models'. Molecular crowding, one of critical features of the intracellular environment, affects many fun- 
damental cellular processes, such as protein folding 2 , protein aggregation 3 , enzyme activity 4 , reaction kinetics 5 , and 
signal transduction 6 . Moreover, the relative size and shape of crowding agents are also crucial parameters in 
molecular crowding 7,8 .The specific effects of molecular crowding on any particular system, however, are not easily 
predicted even given detailed knowledge of the specific physical parameters of the given reaction system 9 ' 11 . 
Therefore, quantitatively characterizing how crowding influences any given reaction system is a very challenging 
problem using in vivo or in vitro model systems. Computational modeling and simulation methods provide 
alternative ways to estimate the effects of individual parameters singly or in combination because these methods 
allow for us to easily and precisely vary reaction system parameters separately and in combination, a capability 
difficult to match in any real experimental system. Prior computational models, however, have significant limitations 
with respect to simulating model reaction systems under crowded conditions. Ordinary differential equation (ODE) 
models generally assume a well-mixed and diluted solution and thus ignore the crowding effect. Partial differential 
equation (PDE) models can include spatial constraints but also provide no explicit basis for modeling the crowding 
effect specifically 12 . Particle-based simulations provide a way to more directly address the molecular crowding effect 
and explore how many parameter changes might influence it. Lattice-based models provide a computationally 
efficient means of performing such simulations but require simplified models of particle movement and structures 
and are therefore prone to exaggerating the crowding effect 13 . Off-lattice models based on Brownian or Langevin 
dynamics provide for more realistic simulations but at a high computational cost for highly crowded conditions 14,15 , 
making them unsuitable for modeling crowding for large numbers of particles or long time scales. There is conse- 
quently no method that has both the realism to accurately and quantitatively simulate reactions in crowded media 
and the efficiency to do so for sufficiently large system sizes and time scales to model many biological reactions. 

In order to overcome the inherent limitations of the various methods available, we propose a novel strategy 
intended to yield both efficiency and accuracy in modeling chemistry in crowded media. We accomplish this goal 
by using relatively computationally costly particle simulations on simple test systems to determine how multiple 
parameters act singly and in combination to influence the crowding effect and then fit an easily-applied regression 
model to outputs of these simulations for use in quickly making crowding corrections to more efficient simulation 
methods. In previous work, developed a coarse-grained two-dimensional stochastic off-lattice particle simulation 
(2DSOLM) 16 based on Green's function reaction dynamics 17 for simulating binding kinetics in various molecular 
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crowding conditions. We subsequently showed that one can simplify 
the problem of modeling of the crowding effect by identifying a 
subset of "separable" parameters, whose effects one could learn inde- 
pendently and then merge into a model of their collective effects 18 . 
These four parameters in 2DSOLM are the total concentration (C); 
the probability of binding upon a collision between two reactant 
monomers (B); the mean time for dissociation event (M), defined 
as the inverse of the rate constant; and the diffusion coefficient for 
reactants and inert particles (D). The work showed that such an 
approach can simplify the problem of quantitatively modeling reac- 
tions in crowded media by allowing one to independently account for 
the influence of several separable parameters. It did not, however, 
resolve the inherent problem that the crowding effect depends on 
non-linear interactions of multiple physical parameters. 

Here, we generalize the prior approach to account for both 
separable and inseparable parameters by creating a collective 
model accounting for the prior parameters and two additional para- 
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one another and from that of the total concentration (C). Figure 1 
illustrates these two cross-dependent interaction parameters. The 
parameter a captures the change in excluded volume induced by bind- 
ing between two reactant monomers. Fig. la illustrates three different 
scenarios after the reaction: a<2, a=2, and a>2. The first case repre- 
sents a decrease in the area occupancy of two reactants, as might occur 
if one molecule docks into a solvent-inaccessible binding pocket of 
another molecule or the two bind tightly along two rough faces whose 
solvent- accessible areas were significantly larger than their van der 
Waals areas. The second case represents an unchanged area occupancy 
of the product relative to the two separate reactants. The third case 
represents an increase in area occupancy of the product binding, as 
may occur if the proteins bind so as to establish some new solvent- 
inaccessible cavities. This parameter would be expected to influence 
crowding due to an entropic preference for reducing excluded volume 
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Figure 1 | Illustrations of the 2DSOLM reaction model for two different 
cross-dependent interaction parameters. The parameters a. and /? 
correspond to sizes of reactant dimer and crowding agent relative to 
reactant monomer. Cyan circles are reactant monomers, magenta circles 
are reactant dimers, and open circles are inert crowding agents, (a) 
Illustration of varying a. (b) Illustration of varying /?. 



in conditions of high crowding. We would thus expect crowding to 
tend to drive binding for a<2 and inhibit binding for a>2. 

The parameter \] represents another cross-dependent property 
describing the interaction between reactants and inert particles and 
specifically influencing how effectively inert particles provide steric 
hindrance to reactants. Three different cases are illustrated in Fig. lb : 
/?<1, p = l, and jg>l for 2DSOLM. The first case corresponds 
to individual inert particles smaller than reactant monomers, the 
second to inert and reactant particles of equal size, and the third to 
inert particles larger than reactant monomers. Unfolding experi- 
ments on ubiquitin 7 have shown that reducing the size of an inert 
crowding agent can lead to a stronger crowding effect. However, such 
anecdotal observation provides little basis for quantitatively estimat- 
ing how the relative size of crowding agents to reactants will affect 
any given the binding reaction system, especially in a background of 
potentially multiple other parameter changes. For example, while 
scaled particle theory in principle allows one to make such inferences 
for some isolated parameter changes, it is difficult to generalize 
to more complicated scenarios. In the present work, we develop a 
unified model accounting for both the previously separable and the 
inseparable parameters of the simulation, which we refer to as the 
unified regression model. We show that it is possible to accommod- 
ate the three inseparable parameters (a,/?,C) in a single model 
through a multidimensional polynomial regression model and 
validate this model over variations in pairs of parameters. In addi- 
tion, we validate our model of these three inseparable parameters by 
comparison with estimates from scaled particle theories 1 "' 20 . We then 
show that it is possible to produce a full unified regression model 
combining these multidimensional regression models with the prev- 
iously derived separable components. Finally, we show that the uni- 
fied model provides an accurate description of random changes in all 
of the simulation model's physical parameters over a broad range of 
biologically relevant values in all parameters. In the process, we show 
a way to build a model with nearly equivalent predictive power to the 
costly particle simulations, but capable of easy evaluation for use in 
correcting computationally efficient simulations of large systems and 
long time scales. These findings have ramifications for our ability to 
computational model a diversity of cellular functions that are likely to 
be heavily influenced by intracellular crowding in two dimensional 
space, such as protein synthesis, signal transduction, or cytoskeleton 
assembly and disassembly. 

Results 

Simulation in 2DSOLM. We investigated the parameter dependence 
of binding reactions in crowded media by using a two-dimensional 
model of dimerization of a reactant monomer of radius 2.5nm simu- 
lated in a lOOnm X lOOnm space with a hard reflective boundary 
condition. This simple reaction model allowed us to focus on the 
specific issue of binding equilibrium in a setting tractable for exploring 
a broad parameter space with sufficientiy large particle numbers and 
numbers of replicates to produce reproducible results. We began with 
our prior regression model expressing equilibrium constant as a multi- 
plicative function of the effects of the four separable parameters: 
C (total concentration), B (binding probability), M (inverse dissoci- 
ation rate), and D (diffusion coefficient) 18 . We then added the two 
cross-dependent physical parameters, a. and /?, which produce crowd- 
ing effects inseparable from one another and from C. To extend the 
prior regression model to the parameters a, and [>, we first established a 
baseline simulation parameter set with default parameter values of 
B=0.7, M=lns, D=6.95Xl(T 11 mV 1 , x=2, and j8=l, values chosen 
based on our prior simulation studies 1618 to produce a reasonably 
strong crowding effect as well as to approximate the temperature 
and viscosity conditions of the cytoplasm 21,22 . We fixed one additional 
simulation parameter, c?, h , describing a threshold maximum distance 
at which two particles can interact with one another, to be c/ t h=0.5nm 
(one fifth of the radius of a reactant monomer). In SOLM simulations, 
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all particles are diffusible and are separated by at least a distance of d th . 
When the simulation places two reactant monomers within a distance 
of d t h °f one another, they must either undergo a binding reaction or a 
collision event, with probability B of binding or 1 - B of collision 16 . One 
reactant and one inert particle or two inert particles that are placed 
within d t h of one another always undergo a collision event 16 . A binding 
event will result in the formation of a new dimer, while a collision will 
result in the particle positions being resampled to be beyond d th . The 
choice of d t ^ influences the physical model of the simulator, in part 
because d^ is effectively a radius at which particles are assumed close 
enough to exert non-bonded forces on one another and initiate a 
possible binding interaction and in part because it imposes an upper 
limit on the maximum possible crowding level in the simulation. It 
also influences computational efficiency, as small d th can lead to larger 
numbers of events when particles are in close proximity. The default 
d t h value was chosen empirically to permit physiologically reasonable 
levels of crowding cases (up to 0.45 concentration) while still yielding 
efficient simulations. 

We then separately varied the individual parameter values for a 
and /? in conjunction with variations in total concentration using the 
default values for the other parameters to test these cross-dependent 
parameter effects on the binding chemistry. We simulated ten a 
values (1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2) and seven /? values 
(0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6). For different a value simulations, we 
simulated eight total concentrations (0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 



0.45) with dimensionless units of fraction of total simulation area 
occupied by particles. To produce varying concentrations, we began 
with a fixed concentration of reactants of 0.1 and then added inert 
crowding particles to yield each higher value of total concentration 
(e.g., C=0.1 corresponds to purely reactant monomers while C=0.25 
corresponds to concentration 0.1 of reactant monomers and 0.15 of 
inert crowding agents). For simulations of varying ji, we simulated 
seven total concentrations (0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4) again 
using the fixed reactant concentration of 0. 1 and varying inert mono- 
mer concentrations, similar to the a simulations. Each set of para- 
meters was run for 25us with 30 repetitions, with progress recorded 
every 0.15625us. SOLM is a discrete event-driven method and thus 
has no fixed time step, but rather jumps between discrete changes in 
system states with randomly sampled waiting times between events 16 . 
The time interval to collect data on simulation progress only affects 
visualization of results, not the actual progress of the simulation, 
and was chosen to provide sufficient resolution to clearly display 
transient behavior early in the curve and stochastic fluctuations at 
long time scales. Data was collected at five different simulation times 
(5, 10, 15, 20, 25us) to analyze the crowding effect on the test reac- 
tion, with values selected based on our prior work 18 . For each con- 
dition, we measured reaction progress by the mean number of dimers 
as a function of time across all simulations. 

Figure 2 shows the simulation results for varying values 
of parameters a. and /?. Fig. 2a shows mean dimer counts at 
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Figure 2 | Binding equilibrium as a function of concentration for varying a and ft. Cyan circles represent reactant monomers, magenta circles represent 
reactant dimers, and open circles represent inert crowding particles. Embedded screen snapshots show an illustrative quasi-equilibrium state for the 
maximum feasible level of crowding for each curve set. (a) Dimer counts for y.= 1.4(top), 1.6, 1.8,2.0, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2(bottom). Inset image shows 
a snapshot for C=0.45 ando<=1.4. (b) Equilibrium constants for a=1.4(top), 1.6, 1.8, 2.0, 2.2,2.4, 2.6, 2.8, 3.0, 3.2(bottom). Inset image shows a snapshot 
for C=0.45 and 2 = 3.2. (c) Dimer counts for /?=0.4(red, top), 0.6(green), 0.8(blue), l.O(cyan), 1.2(magenta), 1.4(black), 1.6(gray, bottom). Inset image 
shows a snapshot for C=0.4 and /?=0.4. (d) Equilibrium constants for /?=0.4(red, top), 0.6(green), 0.8(blue), l.O(cyan), 1.2(magenta), 1.4(black), 
1.6(gray, bottom). Inset image shows a snapshot for C=0.4 and /?= 1.6. 
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quasi-equilibrium for varying a values. As the parameter value a 
decreases, the number of dimers at equilibrium rapidly increases, 
especially for the high crowding cases. Interestingly, there is only a 
slight difference between dimer counts at very low crowding levels 
but the separation between the equilibrium counts rapidly increases 
with increasing crowding level. The K eq curve, calculated from simu- 
lation data, more clearly shows how strongly the parameter a influ- 
ences the model reaction system, as shown in Fig. 2b. These results 
show that the parameter a and the parameter C are cross-dependent 
and inseparable. While the size of a dimer does not significantly 
influence the binding reaction under low crowding conditions, a 
smaller dimer is much more conducive to binding than a larger 
dimer under densely crowded conditions. The embedded images in 
Fig. 2a and 2b show snapshots of the simulator at the quasi-equilib- 
rium state (25(.is) for a= 1.4 and a = 3.2 respectively at under a highly 
crowded condition (C=0.45) in 2DSOLM. Fig. 2c shows mean dimer 
counts at the quasi-equilibrium state based on simulation data for 
varying fi. As the parameter value [> decreases, the number of dimers 
at the equilibrium state rapidly increases, especially for high crowd- 
ing cases. Similar to the parameter a case, variations in /? produce 
little change at low crowding levels, but the difference rapidly 
increases as the crowding level increases. The K eq curve, calculated 
from simulation data, shows how strongly the parameter fS influences 
the model reaction system, as shown in Fig. 2d. These results show 
that the parameter ji and the parameter C are cross-dependent and 
inseparable. While the size of an inert crowding agent does not 
significantly influence the binding reaction under low crowding con- 
ditions, binding is much more favorable in the presence of a smaller 
versus a larger crowding agent under densely crowded conditions. 
The embedded images in Fig. 2c and 2d show snapshots of the 
simulator at the quasi-equilibrium state (25|is) for /?=0.4 and 
j8=1.6 respectively at a highly crowded condition (C=0.4) in 
2DSOLM. 

Unified regression models. We developed three different polynomial 
regression models: K eq (C,a) with default fi value ( = 1), K eq (C,fi) with 
default a value (=2), and K eq (C,oi,P). To find the best-fit degree for 
each model, we applied leave-one-out cross validation to models of 
varying degree derived from the simulation data. For K eq (C,oi), a total 
of 80 different simulation data points was collected, making 11* degree 
the highest possible before the number of regression coefficients 
exceeds the number of data points. For K eq (C,P), a total of 49 different 
simulation data points was collected, making 8 th degree the highest 
possible. For K eq {C,ct.,j5), a total of 122 different simulation data points 
was collected, making 7* degree the maximum possible. Figure 3 
shows the root mean square errors of the leave-one-out cross valid- 
ation for K eq (C,ci), K eq (C,P), and K eq (C,a.,P), respectively. Sixth degree 
polynomials were selected as the best-fit models after cross validation. 
The resulting best-fit regression models for K eq (C,a), K eq (C,fi), and 
K eq (C,x,[!) are provided in equations (1-3). 



K eq (C,a.) 



: [(-1.441) + (1.969C + 3.751c 



+ (- 7.5Ca + 17.326C 2 - 3.712a 2 ) 

+ (4.204Ca 2 + 24.718C 2 a- 163.106C 3 + 1.943a 3 ) 

+ ( - 1.029Ca 3 - 13.071CV - 31.65C 3 a + 572.072C 4 - 0.566a 4 ) (1) 

+ (0.217Ca 4 -0.048C 2 a 3 +44.2C 3 a 2 -167.569C 4 a-533.094C 5 

+ 0.085a 5 ) + ( - 0.044Ca 5 + 0.801C 2 a 4 - 12.484C 3 a 3 

+ 54.042C 4 a 2 - 88.673C 5 a + 462.022C 6 - 0.005a 6 )] 

K eq (C,fS) = 10' 15 x [(-0.99) + (l2.39C + 4.43/?) 
+ ( -22.25C/i-95.67C 2 - 10.55/? 2 ) 
+ (30.73C/i 2 + 84.2C7i + 424.36C 3 + 13.56 / 3 3 ) 
+ (-31.3C/i 3 -12.66C 2 « 2 

(2) 

- 345. 58C 3 /? - 896.2C 4 - 9.3/? 4 ) 

+ (18.33C/5 4 -20.06C 2 /i 3 + 116. 16C 3 ^ 2 + 523.21C 4 /J + 852.66C 5 + 3.13/J 5 ) 
+ ( - 4.84C/i 5 + 16.6C 2 /i 4 - 79.57C 3 /? 3 + 168.14C 4 /5 2 - 777.57C 5 /i 
+ 119.35C 6 -0.38/i 6 )] 



K e JC,a.„ 



-10- 



' x [(- 2.66) + (19.85C + 4.42a + 2.79/?) 



+ (-111.51C 2 -4.14a 2 -5.83/? 2 -14.42Ca-21.44C/?) 

+ (370. 09C 3 + 2.1a 3 + 6.71/? 3 + 7.21Ca 2 + 28.23C/? 2 

+ 55.44C 2 a + 88.32C 2 /?) + ( - 566.94C 4 - 0.6a 4 - 3.89/? 4 

- 103.33C 3 a - 366.94C 3 /?- 1.78Ca 3 - 29.21C/? 3 -21.42C 2 a 2 

(3) 

-11.53C 2 /? 2 ) + (702.26C 5 +0.09a 5 + 0.91 / 3 5 -83.81C 4 a 

+ 546.94C 4 /? + 0.33Ca 4 + 17.81C/f + 56.4C 3 a 2 + 126.84C 3 ^ 2 

+ 0.95C 2 a 3 - 23.24C 2 /? 3 ) + (285. 17C 6 - 0.005a 6 - 0.008/i 6 

- 137.49C 5 a - 777.57C 5 /? - 0.05Ca 5 - 4.84C/? 5 + 50. 16C 4 a 2 

+ 156.28C 4 /? 2 + 0.78C 2 a 4 + 17.39C 2 /? 4 - 13.57C 3 a 3 - 79.57C 3 j3 3 )] 

Surface plots of K eq values as functions of both parameters a and C 
are shown in Fig. SI and S2 for simulation data and for the best-fit 
polynomial regression models of degrees one through eleven. 
Similarly, the surface plots of K eq values as functions of both para- 
meters P and C are shown in Fig. S3 for simulation data and for the 
best-fit polynomial regression models of degrees one through eight. 
Finally, the unified regression models were built, which combined 
with previous regression model 18 , shown in Methods. 
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Figure 3 | Least-squares error rates for leave-one-out cross validation test for different polynomial degree, (a) K eq (y.,C). (b) K eq (fi,Q. (c) K eq (a,fi,Q. 
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Evaluation of the unified regression models. To evaluate three 
different unified regression models in equations (5-7 in Methods 
below) with the best-fit sixth degree polynomials (1-3), we randomly 
selected 30 test cases for each model with fixed reactant concentra- 
tion 0.1 and other parameters selected uniformly at random from the 
ranges C = 0.1 - 0.4, B = 0.1 - 0.9, M = 0.6 - 1.4ns, D = 1.95 - 
21.95 X lO^'mV 1 , a = 1.4 - 3.2, and [3 =0.4 - 1.6. For each test 
case, we compared mean dimer counts from the simulation to the 
estimated dimer counts predicted by the unified regression models. 
Simulation values were averages of over 30 repetitions per data point. 
As an additional test of the performance of the model, we conducted 
another 30 random trials for each model for a 200nm X 200nm 
boundary and a 400nm X 400nm boundary in order to explore 
whether boundary effects significantly impact the fit of the regression 
model. We similarly compared these simulation results with esti- 
mates from the regression model. Simulation values were averages 
of over 5 repetitions per data point. Figure 4 shows the comparison 
between the simulated values and estimated values from our regres- 
sion models using random parameter values from the full parameter 
space of each regression model. For the lOOnm X lOOnm cases in 
Fig. 4a, d, and g, our regression model closely matches the 
dimer counts derived from the particle simulations at the quasi- 
equilibrium state. The results suggest that one can use the regression 
model as a reliable and much faster replacement for the particle 
simulations for quantitatively predicting equilibrium constants over 
the parameter range examined. For the 200nm X 200nm cases in 
Fig. 4b, e, and h, and the 400nm X 400nm cases in Fig. 4c, f, and i, 
simulations and regression predictions still match very well, 
although the total particle number increases by four and sixteen 



times respectively, suggesting that boundary effects have minimal 
influence even for these relatively small simulations. The five-para- 
meter regression models (K eq (C,B,M,D,u.) with /?= 1 in Fig. 4a, b, and 
c, K eq (C,B,M,D,P) with a. = 2 in Fig. 4d, e, and f) accurately predicted 
the simulation dimer counts for simultaneous variation across the 
range of parameters previously examined singly or in pairs. The 
correspondence between the regression models and full simulations 
was somewhat worse for the full six parameter regression model 
(K eij (C,B,M,D,a,p) in Fig. 4g, h and i). Overall though, the correlation 
is strong for the vast majority of the cases, which supports the equi- 
valency of the regression and simulation predictions over a broad 
range of biologically relevant values for all parameters. 

Comparison with scaled particle theory. The influence of non-ideal 
interactions on chemical reactions can be estimated by thermo- 
dynamic theories 9-20 . The excluded volume effect that becomes sig- 
nificant under conditions of high molecular crowding is one of 
critical non-ideal interactions. A correction factor to the equilibrium 
constant for excluded volume can be approximately calculated, yield- 
ing a corrected activity coefficient of reactants and products: 
K eq = T exc K°, where K eq is an apparent equilibrium constant, K" is 
the equilibrium constant in the ideal state, and V exc is the correction 
factor for the exclusion effect from interactions among particles in 
the reaction system 9,20 . 

To validate our model, we calculated the correction factor and 
apparent equilibrium constant for various parameter conditions for 
the unified regression model in Eq. (3), 2DSOLM simulation, and 
scaled particle theory in two dimensional space (2DSPT) 19-20 We first 
ran additional simulations at an uncrowded 1% concentration case 
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Figure 4 | Comparison of dimer counts obtained from regression estimates vs. simulations. A data point on the diagonal line would indicate perfect 
agreement between the two values while points above the line show overestimates of simulation values and points below the line underestimates, (a) K eq { y, Q 
in lOOnm X lOOnm. (b) K eq (y,C) in200nm X 200nm. (c) K eq (u,Q in400nm X 400nm. (d) K eq (j5,C) in lOOnm X lOOnm. (e) K eq (fS,C) in200nm X 200nm. 
(f) K eq (P,Q in 400nm X 400nm. (g) K eq (y.,j3,C) in lOOnm X lOOnm. (h) K aJ (y,p,C) in 200nm X 200nm. (i) K eq (y.,p,Q in 400nm X 400nm. The 
concentration of reactants is fixed at 0.1 and the concentration of inert crowding agents is randomly selected from 0.0 to 0.3. The other parameters are 
uniformly randomly selected, with B from 0.1-0.9, M from 0.6-1.4ns, D from 1.95-21.95 X lO^'mV, a from 1.4 - 3.2, and ji from 0.4 - 1.6. 
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Table 1 | K° [counts^m 2 ] for two different parameters (/VI and a) 
from simulation results of 1 % concentration, pure reactants. All the 
other parameter values are set to default values (8=0.7, 
D=6.95xlO-"mV, j8=1, and d, h = 0.5nm). The data was 
collected at 5 time points (5, 10, 15, 20, 25ps) for 30 independ- 
ent runs in a 500nm X 500nm simulation area. 
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(C=0.01, pure reactants case) with default parameter values of 5= 0.7, 
Af=lns, D=6.95Xl(T 11 mV 1 ) a=2, 0=1, and d tA =0.5nm. We 
assumed that the interaction among particles at 1% concentration 
can be reasonably ignored, allowing us to estimate an uncorrected 
equilibrium constant at the ideal state {K°) from the simulation results. 
We also ran simulations for three a. values (1.4, 2.0, 2.6) and three M 
values (0.1, 1, 10ns). In addition, we ran 1% concentration simulations 
for three values (0.5, 0.25, 0.125nm). Each of these simulations was 
used to calculate a K" for the appropriate condition. Tables 1 and 2 
show the calculated K" from these simulation results. 

We then estimated apparent equilibrium constants as a function of 
total concentration for each set of parameters examined above. For 
tests of variation in a. and M, we examined two different situtations of 
crowding induced by 0.1 concentration of reactants (Cr) plus inert 



Table 2 | K° [counts~'m 2 ] for different d,h parameters from simu- 
lation results of 1% concentration, pure reactants. All the other 
parameter values are set to default values (8=0.7, M= 1 ns, 
D=6.95xl0 11 m 2 s 1 , a=2, and /?=1). The data was collected 
at 5 time points (5, 10, 15, 20, 25us) for 30 independent runs in 
a 500nm X 500nm simulation area. 



cL = 0.5nn 



0.25nm 



0.125nm 



2.46e-17 



4.94e-l 7 



9.37e-17 



crowding agents to yield the eight total initial concentrations: 0.1, 
0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45. As a secondary validation of the 
model outside the training conditions, we examined crowding 
induced purely by reactant particles at the same eight initial concen- 
trations: 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45. In both cases, apparent 
equilibrium constants for SOLM were calculated from simulations 
and the apparent equilibrium constants for unified regression model 
(URM) were estimated from Eq. (7). The apparent equilibrium con- 
stants from scaled particle theory (SPT) were calculated using calcu- 
lated K" from SOLM simulation of 1% pure reactant case and 
estimated correction factors based on the equations in 2DSPT for 
given parameters (concentration of reactants (C R ), concentration of 
inert crowding agents (Q), a, and ji) 1%2a . 

Figure 5 compares apparent equilibrium constants for crowding 
induced by adding inert crowding agents to fixed 0.1 C R and for pure 
reactants for varying M and a. URM, SOLM, and SPT all show an 
apparent equilibrium constant that nonlinearly increases with more 
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Figure 5 | Apparent equilibrium constants for SOLM, URM, and SPT for two different parameters (Mand a), (a) M=0. Ins. (b) M= Ins. (c) M= 10ns 
for 0.1 concentration of reactants with additional inert crowding agents, (d) M=0.1ns. (e) M= Ins. (f) M= 10ns for pure reactant simulations. All other 
parameter values are set to default values (B=0.7, D=6.95X10~ n m 2 s~ I , JS=1, and d,i, = 0.5nm). 
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compact dimers and higher total concentrations. In addition, the 
apparent equilibrium constant for URM, SOLM, and SPT increases 
as the mean time of dissociation events increases from 0.1ns to 10ns. 
The test case for the exclusion effect mainly from the inert crowding 
agent, shown in Fig. 5a,b,c, showed that URM more closely tracks the 
SOLM results than does SPT, consistent with the notion that the 
regression model is better able to mimic the more complicated inter- 
action patterns found in the full simulations than are the more 
simplified assumptions of the SPT model. However, tests involving 
high crowding induced by the reactant particles themselves rather 
than an inert crowding agent, shown in Fig. 5d,e,f, showed that SPT 
predicts the apparent equilibrium constant more closely to SOLM 
simulation results than URM does, especially in cases involving more 
compact dimers. These latter results demonstrate that the ability of 
the regression model to accurately capture the behavior of the par- 
ticle simulation can break down outside the parameter domain to 
which it was trained. 

As a further demonstration of the utility of the regression 
approach, we added an extension to the model to account for an 
additional parameter, d^, a threshold distance used by the simulator 
for identifying possible reaction events among nearby particles, as 
well as by the Green's function reaction dynamics (GFRD) algorithm 
to trigger new sampling of particle positions 1617 . This parameter can 
be considered to roughly capture the difference between an inter- 
action radius at which two particles can come close enough to influ- 
ence and a collision radius at which they would actually sterically 
interfere with one another. Figure 6 shows the apparent equilibrium 
constant for fixed 0.1 Cr with added inert crowing agents for three 
different fS values and three different d th values. The parameter effect 
of the threshold distance (d th ) is nonlinear and inseparable from 
other parameters that influence crowding, particularly total concen- 
tration (C). Because the URM was not trained to model d th , we added 
an additional multiplicative and separable term to the URM by per- 
forming a regression fit to a quadratic function in C and d th . The 
resulting term 

(5.72 - 6.1C- 18.03d,,, + 15.0904 -2.24C 2 + 16.4<4) 

provides the best quadratic fit for the given parameter range using a 
leave-one-out cross-validation. The figure plots predicted K eq for 
SOLM simulations, SPT corrections to low-concentration simula- 
tions, and the URM extended with the above C and d t ^ cross-term. 
There is no explicit SPT correction available for this para- 
meter, although the SPT results do reflect the change in the 



low-concentration equilibrium K° induced by changes in d t ^. All 
three models show similar non-linear behavior at the default d th of 
0.5nm. At d th = 0.25 nm, the extended URM provides a noticeably 
closer fit to the full particle simulations than does SPT, showing that 
even a crude and essentially physically naive regression model can 
reasonably capture the behavior of the full simulations. At 
drt = 0.125nm, results are more mixed, with the extended URM 
showing a better fit for much of the range of /i and C, although 
eventually breaking down at the most extreme conditions of high 
concentration and small crowding agents. The results thus demon- 
strate the ability of regression modeling to capture a physical effect 
for which we lack a good analytical model, although also the tend- 
ency of such regression models to break down as one approaches the 
limits of the parameter range to which they were trained. 

Incorporation of URM corrections into ordinary differential 
equation (ODE) simulations. The major goal of this work is to 
develop fast but versatile corrections for crowding effects that can 
be incorporated into more complex simulation models. To that end, 
we conducted a simple demonstration of how the URM models can 
be used in building efficient crowding corrections into fast crowding- 
free simulations. We simulated a homodimerization reaction, 
comparing SOLM particle simulations of the reaction, standard 
mass-action ODE simulation without crowding corrections, and a 
crowding-corrected ODE using our unified regression model 
(URM). Each SOLM particle simulation used default parameter 
values (£=0.7, M=lns, D=6.95X10~ n mV\ and oc=2, 0=1, 
d^=0.5nm) and was repeated 30 times with the average of the 30 
runs used at each time point for visualization. We repeated each 
simulation using an ODE model of the system lacking correction 
for any crowding effects, deriving a dissociation rate k„jj from the 
mean dissociation time M and an association rate k on from k 0 ff and 
the estimated idealized equilibrium constant K°. We then repeated 
each simulation using an ODE with k 0 g again derived from M but k on 
derived from M and the URM-predicted equilibrium constant 
K eq (C,B,M,D,a,[3) in Eq. (7). More details of the calculations are 
provided in Methods. Two sets of simulations were conducted, one 
using 0.2 total concentration (0.1 C R + 0.1 Q) and one using 0.3 total 
concentration (0.1 Cr + 0.2 Cj). All simulations were run for 1 us of 
simulation time. Figure 7 shows results for C=0.2 (Fig. 7a) and 
C=0.3 (Fig. 7b), highlighting 0.01 us for C=0.2 and 0.1 us for 0.3, 
sufficient to display both transient and pseudo-equilibrium behavior 
in each case. While both ODE models substantially deviate from the 
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Figure 6 | Apparent equilibrium constants for SOLM, URM, and SPT for two different parameters (/} and d^). (a) d,i, =0.5nm. (b) d,i, =0.25nm. 
(c) d t f, =0.125nmfor 0.1 concentration of reactants with additional inert crowding agents. All the other parameter values are set to default values (B=0.7, 
M=lns, Z>=6.95X10~ 11 mV 1 > and <x=2). 
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Figure 7 | Demonstration of URM as a method for adjusting for crowding effects in simulations. Each plot shows a comparison of a full SOLM model, 
an ODE model ignoring crowding effects, and an ODE model using the URM to adjust for crowding in the simulation. Each SOLM simulation uses 
default parameter values (B=0.7, M=lns, D=6.95X10 _11 m 2 s _1 , and a=2, /J=l, =0.5nm). Each SOLM curve shows an average over 30 repetitions, 
with error bars omitted for clarity. ODEs are simulated using estimated rate parameters from the quasi-ideal equilibrium constant in table 1 and default 
parameter value of M. URM simulations are simulated using the default value of Mto derive reverse rates and URM estimates of k eq to determine forward 
rates for the given M. (a) total concentration 0.2 (0.1 C R + 0.1 Cr). (b) total concentration 0.3 (0.1 Cr + 0.2 Q) 



full particle model in the early transient phase, the URM-corrected 
model adjusts to yield an equilibrium value consistent with the full 
SOLM particle model while the uncorrected ODE substantially 
understates the equilibrium produced by the full particle model. 
Table 3 shows run times for these sample simulations, as well as 
larger simulations quadrupling the simulation space and particle 
counts. As the table shows, both the crowding-naive and URM- 
corrected ODE models show comparable run times to one another, 
essentially independent of the scale of the problem. The SOLM par- 
ticle simulations, however, produce substantially longer run times 
than those of either ODE model in both conditions, and show 
approximately a 16-fold increase in run time as the system size is 
scaled four times. This quadratic increase in problem size for the 
particle methods is expected due to an approximately linear increase 
in work per event and linear decrease in mean waiting time between 
events with increasing problem size 16 . 

Discussion 

We have demonstrated that it is possible to establish a unified regres- 
sion model of binding equilibrium in crowded media that can in 
principle incorporate any physical parameters one can model in a 
full particle simulation, even when those parameters exhibit cross- 
dependencies for which we lack any predictive analytical theory. We 
have shown that this approach can accurately mimic a computation- 
ally costly particle model over a broad range of several such para- 
meters. We have in the process shown the feasibility of a strategy for 



quantitative simulation of reaction chemistry in crowded media that 
will maintain comparable accuracy to detailed particle simulations 
while enabling the use of much faster simulation methods. Our 
model is consistent with previous theoretical studies 20 in finding that 
total concentration (C) and size ratio parameters (a, /?) act non- 
linearly to affect the binding equilibrium constant, and further cap- 
tures both the cross-dependency of three key parameters (C, a, /?) on 
the model reaction system and the independent parameter effects of 
three other parameters (B, M, D). The correction factors and appar- 
ent equilibrium constants derived from our regression model gen- 
erally match well with those derived from scaled particle theory, 
demonstrating that our model captures the crowding effect for vari- 
ous conditions of C, a, and /?, reasonably. However, because the 
apparent equilibra and corrections of our regression model are based 
on empirical measurements from particle simulations, the approach 
will extend in a straightforward manner to a much broader range of 
parameters. We have further demonstrated how one can use this 
regression model as a fast correction for crowding effects in efficient 
ODE simulations. While the results of this corrected ODE model 
do not capture kinetics of the crowded system as well as the true 
particle simulation, they nonetheless provide an alternative that 
accurately captures the effects on system equilibrium of the full par- 
ticle simulations while exhibiting run times comparable that of a 
simple ODE. 

The principal alternatives for modeling chemistry in crowded 
conditions are the direct use of detailed particle models, similar to 
our SOLM but potentially far more involved in the range of para- 
meters and details of the physical model, and scaled particle theory, 
which uses simpler models for which one can analytically calculate 
the contribution of the excluded volume event to reaction equilibria. 
The regression approach is intended to balance the competing ben- 
efits of these two alternatives: particle models in allowing one to 
explore effects of many parameters with potentially complicated 
cross-dependencies and SPT in allowing one to quickly compute 
corrections for crowding. Particle models are computationally costly, 
making them unsuitable for extremely large systems, slow reactions, 
or very complicated models. SPT's major limitation is that it depends 
on one's ability to derive analytical models for the specific parameters 
or phenomena one wishes to consider. SPTs to date make various 
assumptions that one may wish to relax in practice, and such correc- 
tions may prove difficult to analyze. For example, both SPT and 
SOLM conventionally assume that individual particles are hard 
spheres in 3D and hard circles in 2D. This assumption has been 



Table 3 | Run time comparison for SOLM particle simulations, ODE 
simulations using URM corrections for crowding, and uncorrected 
ODE simulations. The table reports run times in seconds for 1 us 
simulations of dimerization reactions using default parameter 
values for each of the three models and for two system sizes and 
two crowding levels. 



C=0.2 O0.3 

i us (o. i c R + o. i q (o.i c R + 0.2 q 



lOOnmXlOOnm SOLM 3.13 10.33 

URM 0.36 0.38 

ODE 0.3 0.3 

200nmx 200nm SOLM 46.82 163.61 

URM 0.35 0.38 

ODE 0.31 0.29 



SCIENTIFIC REPORTS | 1 : 97 | DOI: 1 0. 1 038/srep00097 



8 



relaxed in various ways in the SPT literature. Rajagopalan's group 
extended from the hard-sphere interaction model of SPT to the 
nonbounded square-well interaction model 23 , and Mittal's group 
examined both repulsive interaction and attractive interaction of 
crowding particles 24-25 . In addition, Minton extended from a simple 
hard sphere model in SPT to two different particle models of a 
querying protein: Gaussian cloud and equivalent hard sphere to 
examine the excluded volume effect on more realistic conditions 26 . 
These are not trivial extensions, however, and are not easily made by 
users of the technology who are not experts in the underlying theory. 
By contrast, the results above demonstrate how a comparable exten- 
sion of the regression approach can be accomplished with no major 
conceptual changes to account for varying effects of an extra para- 
meter, d t h, that effectively controls a difference between the collision 
radius and interaction radius of reactant particles. While the essen- 
tially naive modeling that goes into such regression fitting has its own 
limitations and cannot replace good theory when that is available, it 
does provide a serviceable substitute for theory for systems too com- 
plex to allow for analytical solutions. Our regression approach thus 
provides a key step towards computationally practical simulations of 
complex reaction systems in crowded media and thus towards real- 
istic models of intracellular biochemistry. 

As our demonstration example of URM-corrected ODEs shows, 
the model exhibits essentially the same thermodynamic behavior as 
the full particle simulations. However, the corrected ODEs exhibit 
very different kinetics from the full particle simulations, more 
comparable to those of the ODE. We attribute this inaccuracy in 
transient behavior to the fact that the URM at present is fit only to 
thermodynamic data and thus leaves the rate constants underdeter- 
mined. The regression approach can in principle be extended to fit 
rate constants, not simply equilibrium constants, potentially yielding 
fast corrections to both kinetics and equilibrium but future work will 
be needed to explore the effectiveness of such regression corrections 
on kinetics. 

Even for thermodynamics, the equilibrium model is not a perfect 
representation of the system and additional challenges may be cre- 
ated by extending it to more parameters or unexplored regions of the 
parameter space. The accuracy of full six parameter model is 
decreased relative to that of the five-parameter models, which sug- 
gests that more accurate accounting of cross-effects may be needed. 
This observation may derive in part from our decision to omit cross 
afi terms in the regression equation to reduce the number of regres- 
sion parameters to a feasible level. It has previously been established 
that concentration acts nonlinearly on binding chemistry 9 14,20,23 . We 
can further conclude that allowing for cross-dependent parameters 
does require a higher order model of C than the third-order model 18 
that proved sufficient when capturing only the effects of separable 
parameters. Further increasing model complexity may be required to 
handle a richer model incorporating physical factors neglected in the 
present model. 

It is also important to note that although the motivation behind 
this work is to improve models of chemistry in the intracellular 
environment, the model itself is a highly simplified view of binding 
chemistry in crowded media. The true intracellular environment is 
far more complicated than our simple reaction model, containing 
various mixtures of different sizes of proteins and other molecules 
that are often irregularly shaped and involved in many forms of 
macromolecular complex or other complicated interaction patterns. 
Recent simulation work in the area has shown the value of far more 
complicated physical models, e.g., in the recent work of Kim and 
Yethiraj using Brownian dynamics to explore effects of crowding on 
ligand binding to membrane-bound receptors. 27 While our work has 
not been extended to such detailed models, it nonetheless shows that 
it is possible to produce a simple unified mathematical model that 
can reliably reproduce the results of qualitatively similar computa- 
tionally costly particle simulations for variations in several key 



parameters of such a system. Nonetheless, one would need to account 
for many other parameters to produce a quantitatively predictive 
model of a realistic intracellular medium or any particular reaction 
system within it. While no method, including ours, can yet approach 
that goal, the present work provides a new approach toward more 
complex simulations at any degree of abstraction. Another obvious 
extension of the model is to three dimensional systems. We chose to 
establish the regression approach on two dimensional systems prim- 
arily because it provides a simpler and more computationally tract- 
able framework in which to validate and demonstrate the proposed 
regression strategy. Two dimensional crowding does nonetheless 
have important applications in modeling association of membrane 
bound proteins 20 , for example in helping to explain observed patterns 
of clustering in the syntaxin-1 system 28 . While there are no appre- 
ciable conceptual obstacles to adapting the present methods to three 
dimensions, it remains to be shown that accuracy of the model would 
be comparable to that in 2D and that the increased computational 
cost of running particle simulations for training the regression 
models would not be prohibitive. 

In the future, we hope to expand on the general idea of fitting 
regression models to increasingly detailed particle simulations in 
order to develop more realistic and complicated reaction models 
suitable for producing quantitatively predictive simulations of reac- 
tion chemistry in vivo. A similar approach may prove valuable to a 
diversity of fields that similarly depend on computationally costly 
particle models, including various disciplines of biology, physics, 
chemistry, and engineering. 

Methods 

2DSOLM. Our previously developed two dimensional stochastic off-lattice model 
(2DSOLM) 16 was used to conduct all particle simulations described in the paper. A 
detailed description of the simulation program and its underlying algorithms is 
provided in our prior work' 6 . To effectively simulate some high concentration cases, a 
modified initialization protocol was used, as described in our prior research 18 . 
Initially, all reactants are monomers. Initialization of particle positions is performed 
by establishing a grid of potential particle positions at the maximum possible packing 
density for whichever of reactant monomers and crowding agents occupies the larger 
total area and then randomly inserting particles into the corresponding grid positions. 
This protocol was developed because it makes it possible to initialize in highly 
crowded conditions where independent uniform placement of particles would usually 
result in overlapping particles, such as for high concentration cases. The simulation 
program of 2DSOLM was implemented by C+ + programming language and run on 
a Linux Beowulf cluster. The collected data files were analyzed and plotted using 
Matlab (R2008a). Figures 2, 3, 4, 5, 6, 7, SI, S2, S3 were generated using Matlab. 
Simulation snapshots were plotted using Matlab and then simulation movies were 
made using Adobe Photoshop by concatenated these snapshots. The simulation code 
for 2DSOLM was released at the web site: http://www.cs.cmu.edu/-russells/projects/ 
crowding/ crowding.html. 

The specific simulations run and their parameters are as described in the Results. 

Calculating thermodynamic equilibrium constants by SPT. In calculating the 
thermodynamic correction factor (T exc =K e JK a ), we first assumed that 1% pure 
reactant case can be reasonably considered as an ideal state at which interaction 
among particles in such a diluted condition will be negligible 20 . For example, the 
calculated correction factor of 1% pure reactant case using simulation data in the 
default parameter value is 1.02, which is reasonably closed to the ideal state {T exc = 1). 
The simulation area was set to 500nm X 500nm for this condition to yield a large 
enough number of reactants to get accurate measurements for the given parameter 
conditions at 1% pure reactants case. The correction factor for SPT 

(J'exc — — e 21n?M ~ where y M and y D are activity coefficients of reactant 

monomer and dimer, respectively) was calculated by using the simulation data at 1% 
concentration to determine K° followed by SPT analytical approximation to the 
activity coefficient under crowded conditions, as described in the prior literature 
on scaled particle theory in two dimensional space 19,20 . The density of reactant 
monomers and dimers under any given condition was calculated by the following 
equations 18 : 

_4K eq p R + l- y /SK eq p R + l 



2 Pd„+Pm„=Pr, 

where p !{ = density of reactant monomers at initial state [particle counts/simulation 
area], p M ^ = density of reactant monomers at quasi-equilibrium state in the ideal 
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state, and p D ^ = density of reactant dimers at quasi- equilibrium state in the ideal 
state. K is then calculated by solving the equation K/r exc (K) — K° — 0 where 
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where p (density) = number of particles / simulation area, R — radius of a particle for 
each particle species: M (reactant monomer), D (reactant dimer), I (inert crowding 
particle) 19 ' 20 . Finally, the apparent equilibrium constant is calculated by multiplying 
the correction factor to K°, which is calculated from simulation results of 1% pure 
reactant cases for given parameter conditions. 

Simulation movie files. We made two movie files to demonstrate the simulation 
process and show the effects of the two cross -dependent parameters. Each movie 
contains two different moving images, which are distinguished by different values for 
a single parameter while all other parameters use the common default values. Video 
SI presents k— 1.4 vs. a— 2.6 for C= 0.45 and video S2 presents /3=0.4 vs. /?=1.6 for 
C— 0.4. The first half of each movie shows the system in the initial (pre-equilibration) 
state, and the second half of the movie shows a quasi- equilibrium state. Movie files 
illustrating the simulation progress and the effects of each parameter individually 
were provided for the previous independent parameters' 8 . High-resolution versions 
of the movies can be downloaded from: http://www.cs.cmu.edu/-russells/projects/ 
crowding/ crowding.html. 

Fitting K eq to simulation results. We calculated the equilibrium constant (K eq ) of the 
homodimerization reaction by solving for K eq as a function of the initial monomer 
concentration and mean steady-state dimer concentration. We then developed a 
regression model for K eq as a function of the simulation parameters by extending our 
prior regression model of four parameters (C, B, M, and D) up to six parameters 
(C, B, M, D, a, and fl). First, we used our prior model for the contributions of the 
parameters B, M, and D, which contribute linearly and independently to the 
equilibrium constant 18 , as shown in equation (4): 

K cll (B,M,D) = (^](^-)( ° 2 ,) (4) 



.95 X lO- 11 ^!^- 1 



We then built regression models of K eq (C,a), K eq (C,[l), and K eq (C,oc,[l) to include the 
parameter sets (a), (/?), and (a,/?). Because the three parameters (C,a,)3) are cross- 
dependent, they cannot each be separately incorporated into the regression model. 
Unified regression models for these three test cases are provided in equations (5-7): 

K eq (CMM,D,a)=K tq (C,a)K eq (B,M,D) (5) 



K eq {C,B,M,D,l3)=K cq (C,P)K eq (B,M,D) 
Ke q (C,B,M,D,ai,P)=K eq (C,oi,P)K eil (B,M,D) 



(6) 
(7) 



To find the best-matched regression model for each of the three cases, we per- 
formed least-squares polynomial regression for each parameter set, beginning with 
first-order regression models and increasing the degree until it was not possible to 
increase further without the number of regression parameters exceeding the number 
of data points. The general forms for the three regression models of K eq {C,c/.), 
K cq {C$), and K eq (C,y.,P) are as follows: 

K eq (C,u) = a 0 + (aiC + a 2 a) + (a 3 Ca-\- a 4 C 2 + a 5 a 2 ) + ■ ■ ■ (8) 



K eq (C$) = a Q + {a l C + a 2 li) + (a 3 CP + a A C 2 + a 5 p 2 ) + 

K eq (C,y,^) — a Q -\-(aiC-\-a 2 ot + a 3 p) 

+ (a 4 Ca + a 5 Cfi-\-a 6 C 2 -\-a 7 ci 2 + a & (3 2 } + ■ 



(9) 



(10) 



Coefficients for the polynomials were calculated by the least-squares polynomial 
regression. For the polynomial K eq (C,y,fi), we neglected the cross a/? terms to reduce 
the number of coefficients of the model, on the hypothesis that these terms will have a 
negligible effect beyond the effects for which Cot and Cfi terms account. We selected 
the best-matched polynomial across degrees by minimizing the least squares error 
using a leave-one-out cross validation test for each of the three cases. 



Extension of the URM to account for threshold distance (d t h). An extension to the 
URM to account for effects of d t h was developed by adapting the same protocol from 
above to learn an additional multiplicative factor to the standard URM. The extension 
assumed that the previous full URM K eq (C,y,ft) could be corrected to account for both 
the direct influence of d t h and the cross -dependence of d f / ( and C with a multiplicative 
term learned by polynomial regression on d^, and C. Simulations for three values of 
d th (0.5, 0.25, 0.125 nm) and three values of ft (0.4, 1.0, 1.6) were fit to a series of 
polynomial models of the form 

r ( C,dth ) = a o + (a i C + a 2 d lh ) + (a 3 Cd th + a 4 C 2 + a 5 d 2 h ) + L 

Least-squares regression with leave-one-out cross-validation found a second order 
model to yield the lowest cross-validated error. These fits were then used to create an 
extended URM, K eq (C,a,p,d th ) = Y(C4ihWe q (C,a^), used to evaluate the ability of 
the regression approach to incorporate additional parameters for which no equivalent 
SPT theory is available. 

Use of URM as a correction to ODE simulations. To demonstrate of the use of the 

URM as a means of quickly providing crowding corrections for fast space-free 
simulations, we conducted a set of demonstration simulations using the URM to 
adjust rates to a fast ODE model. We compared this URM-corrected ODE model to 
full particle simulations with SOLM and to crowding- naive ODEs. SOLM simulations 
were conducted using default parameter values (5—0.7, M— 1 ns, 
D=6.95X10- 11 mV\ and a=2, f$=l, d th -0.5 nm, 100 nm X 100 nm space) for 
0.1 Q + 0.1 Q and 0.1 C fi + 0.2 Q. Each simulation was repeated 30 times with the 
resulting values averaged at each time point for visualization. Comparable ODE 
models were instantiated by assuming the dissociation rate is the inverse of 
dissociation time M and that association time k on is related to the idealized equilib- 
rium constant K° by the formula 

K° for each simulation was estimated for the idealized no-crowding condition from 
1% concentration SOLM simulations, as in table 1. URM-corrected ODEs were 
derived using an ODE model with dissociation rate k 0 ^ again set to M" 1 but k on 
recalculated at each step of the ODE using the URM from the formula: 

k on = K eq {C,B,MVAMog 

The model is thus corrected so as to yield URM-derived equilibrium values. 
Simulations were run for 1 us for each model and concentration. As an additional test 
of run time scaling of the model, we repeated each simulation under identical con- 
ditions but using a 200 nm X 200 nm simulation space, with particle counts scaled 
proportionately to maintain concentrations at C— 0.2 or C— 0.3. Run times were 
recorded for the ode45 call for both ODE and URM simulations and for the full 
execution of the SOLM model for particle simulations. 

Both ODE models were simulated with Matlab 7.7.0, R2008b using the ode45 
embedded Runge-Kutta integration method. SOLM was implemented using C+ + 
and run in GNU bash version 3.2.9 (1), release (x86-64-redhat-linux-gnu). All run 
time tests were run on a single Intel (R) Core (TM) 2 Quad CPU Q9550 @ 2.83GHz, 
4GB RAM workstation using Linux Fedora release 7 (Moonshine). 



1 . Minton, A. P. How can biochemical reactions within cells differ from those in test 
tubes? /. Cell Sci. 119, 2863-2869 (2006). 

2. Tokuriki, N. et al. Protein folding by the effects of macromolecular crowding. 
Protein Sci. 13, 125-133 (2004). 

3. Hatters, D. M., Minton, A. P. & Howlett, G. J. Macromolecular crowding 
accelerates amyloid formation by human apolipoprotein C-II. /. Biol. Chem. 277, 
7824-7830 (2002). 

4. Jiang, M. & Guo, Z. Effects of macromolecular crowding on the intrinsic catalytic 
efficiency and structure of enterobactin- specific isochorismate synthase. /. Am. 
Chem. Soc. 129, 730-731 (2007). 

5. Grima, R. & Schnell, S. A systematic investigation of the rate laws valid in 
intracellular environments. Biophys. Chem. 124, 1-10 (2006). 

6. Eide, J. L. & Chakraborty, A. K. Effects of quenched and annealed macromolecular 
crowding elements on a simple model for signaling in T lymphocytes. /. Phys. 
Chem. B 110, 2318-2324 (2006). 

7. Pincus, D. L. & Thirumalai, D. Crowding effects on the mechanical stability and 
unfolding pathways of ubiquitin. /. Phys. Chem. B 113, 359-368 (2009). 

8. Christiansen, A. et al. Factors defining effects of macromolecular crowding on 
protein stability: an in vitro/in silico case study using cytochrome c. Biochemistry 
49,6519-6530 (2010). 

9. Zimmerman, S. B. & Minton, A. P. Macromolecular crowding: biochemical, 
biophysical, and physiological consequences. Annu. Rev. Biophys. Biomol. Struct. 
22, 27-65 (1993). 

10. Hall, D. & Minton, A. P. Macromolecular crowding: qualitative and 
semiquantitative successes, quantitative challenges. Biochim. Biophys. Acta 1649, 
127-139 (2003). 

1 1 . LeDuc, P. R. 8c Schwartz, R. Computational models of molecular self- organization 
in cellular environments. Cell Biochem. Biophys. 48, 16-31 (2007). 

12. Mayawala, K., Vlachos, D. G. & Edwards, J. S. Spatial modeling of dimerization 
reaction dynamics in the plasma membrane: Monte Carlo vs. continuum 
differential equations. Biophys. Chem. 121, 194-208 (2006). 



SCIENTIFIC REPORTS | 1 : 97 | DOI: 1 0. 1 038/srep00097 



10 



13. Puskar, K., Ta'asan, S., Schwartz, R. & LeDuc, P. R. Evaluating spatial constraints 
in cellular assembly processes using a Monte Carlo approach. Cell Biochem. 
Biophys. 45, 195-201 (2006). 

14. Elcock, A. H. Atomic-level observation of macromolecular crowding effects: 
escape of a protein from the GroEL cage. Proc. Natl Acad. Sci. USA 100, 2340- 
2344 (2003). 

15. Izvekov, S. & Voth, G. A. A multiscale coarse-graining method for biomolecular 
systems. /. Phys. Chem. B 109, 2469-2473 (2005). 

16. Lee, B., LeDuc, P. R. & Schwartz, R. Stochastic off-lattice modeling of molecular 
self-assembly in crowded environments by Green's function reaction dynamics. 
Phys. Rev. E 78, 031911 (2008). 

17. von Zon, J. S. & ten Wolde, P. R. Green's-function reaction dynamics: a particle- 
based approach for simulating biochemical networks in time and space. /. Chem. 
Phys. 123, 234910 (2005). 

18. Lee, B., LeDuc, P. R. & Schwartz, R. Parameter effects on binding chemistry in 
crowded media using a two-dimensional stochastic off-lattice model. Phys. Rev. E 
80, 041918 (2009). 

19. Lebowitz, J. L., Helfand, E. & Praestgaard, E. Scaled particle theory of fluid 
mixtures. /. Chem. Phys. 43, 774-779 (1965). 

20. Grasberger, B., Minton, A. P., DeLisi, C. & Metzger, H. Interaction between 
proteins localized in membranes. Proc. Natl. Acad. Sci. USA 83, 6258-6262 
(1986). 

21. Bicknese, S., Periasamy, N., Shohet, S. B. & Verkman, A. S. Cytoplasmic viscosity 
near the cell plasma membrane: measurement by evanescent field frequency- 
domain microfluorimetry. Biophys. J. 65, 1272-1282 (1993). 

22. Sengers, J. V. & Watson, J. T. R. Improved international formulations for the 
viscosity and thermal conductivity of water substance. /. Phys. Chem. Ref. Data 15, 
1291-1314 (1986). 

23. Hu, Z., Jiang, J. & Rajagopalan, R. Effects of macromolecular crowding on 
biochemical reaction equilibria: a molecular thermodynamic perspective. 
Biophys. }. 93, 1464-1473 (2007). 

24. Kim, Y. C, Best, R. B & Mittal, J. Macromolecular crowding effects on protein- 
protein binding affinity and specificity. /. Chem. Phys. 133, 205101 (2010). 

25. Rosen, J., Kim, Y. C. & Mittal, J. Modest protein-crowder attractive interactions 
can counteract enhancement of protein association by intermolecular excluded 
volume interactions. /. Phys. Chem. B. 115, 2683-2689 (2011). 



26. Minton, A. P. Models for excluded volume interaction between an unfolded 
protein and rigid macromolecular cosolutes: macromolecular crowding and 
protein stability revisited. Biophys. J. 88, 971-985 (2005). 

27. Kim, J. S. & Yethiraj, A. Crowding effects on association reactions at membranes. 
Biophys. J. 98, 951-958 (2010). 

28. Sieber, J. J. et at Anatomy and dynamics of a supramolecular membrane protein 
cluster. Science 317, 1072-1076 (2007). 



Acknowledgments 

We thank Dr. Robert Murphy and the Center for Bioimage Informatics at Carnegie Mellon 
University for providing the use of their computer cluster for this research. This work was 
supported by National Institutes of Health NIAID award #1R01AI076318 (RS), the 
National Science Foundation (CMMI-0856187, CMMM013748) and the Office of Naval 
Research (N000140910215) (PRL). 

Author contributions 

B.L., P.R.L. and R.S. designed the stochastic off-lattice model and the algorithms of the 
simulation program and participated in planning of simulation experiments. B.L. 
implemented and ran the simulation code and analyzed the simulation data. B.L., P.R.L. and 
R.S. wrote and reviewed the manuscript. 

Additional information 

Supplementary information accompanies this paper at http://www.nature.com/ 
scientificreports 

Competing financial interests: The authors declare no competing financial interests. 
License: This work is licensed under a Creative Commons 

Attribution -Noncommercial- Share Alike 3.0 Unported License. To view a copy of this 
license, visit http://creativecommons.org/licenses/by-nc-sa/3-0/ 

How to cite this article: Lee, B., LeDuc, P.R. & Schwartz, R. Unified regression model of 
binding equilibria in crowded environments. 5c/. Rep. 1, 97; DOI:10.1038/srep00097 
(2011). 



SCIENTIFICREPORTS | 1 : 97 | DOI: 1 0. 1 038/srep00097 



